Arabic natural language processing: An overview
Abstract
Arabic is recognised as the fourth most used language on the Internet. Arabic has three main varieties: (1) Classical Arabic (CA), (2) Modern Standard Arabic (MSA), and (3) Arabic Dialect (AD). MSA and AD can be written either in Arabic script or in Roman script (Arabizi), which corresponds to Arabic written with Latin letters, numerals and punctuation. Due to the complexity of this language and the number of corresponding challenges for NLP, many surveys have been conducted in order to synthesise the work done on Arabic. However, these surveys principally focus on two varieties of Arabic (MSA and AD, written in Arabic letters only), and they are somewhat dated (no such survey since 2015), so they do not cover recent resources and tools. To bridge this gap, we propose a survey focusing on 90 recent research papers (74% of which were published after 2015). Our study presents and classifies the work done on the three varieties of Arabic, concentrating on both Arabic and Arabizi, and associates each work with its publicly available resources, where these exist.
Similar resources
An overview of empirical natural language processing.(Natural Language
In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than require the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech reco...
An Overview of Empirical Natural Language Processing
In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than require the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech recognition, syntactic parsing, semantic processing, i...
Arabic Natural Language Processing for Information Retrieval
Human Language Technology has played a big role in implementing Latin-based information retrieval systems. Two of the most cited techniques are stemming and truncation. Numerous studies have shown that the inflectional structure of words has a big impact on the retrieval accuracy of information retrieval systems (IRS) for Latin-based languages. Stemming or truncation is done for two principal reas...
Introduction to Arabic Natural Language Processing
This book provides system developers and researchers in natural language processing and computational linguistics with the necessary background information for working with the Arabic language. The goal is to introduce Arabic linguistic phenomena and review the state-of-the-art in Arabic processing. The book discusses Arabic script, phonology, orthography, morphology, syntax and semantics, with...
An Overview of Probabilistic Tree Transducers for Natural Language Processing
Probabilistic finite-state string transducers (FSTs) are extremely popular in natural language processing, due to powerful generic methods for applying, composing, and learning them. Unfortunately, FSTs are not a good fit for much of the current work on probabilistic modeling for machine translation, summarization, paraphrasing, and language modeling. These methods operate directly on trees, ra...
Journal
Journal title: Journal of King Saud University - Computer and Information Sciences
Year: 2021
ISSN: 2213-1248, 1319-1578
DOI: https://doi.org/10.1016/j.jksuci.2019.02.006